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Primordial non-Gaussianity introduces a scale-dependent variation in the clustering of density 
peaks corresponding to rare objects. This variation, parametrized by the bias, is investigated on 
scales where a linear perturbation theory is sufficiently accurate. The bias is obtained directly in 
real space by comparing the one- and two-point probability distributions of density fluctuations. 
We show that these distributions can be reconstructed using a bivariate Edgeworth series, presented 
here up to an arbitrarily high order. The Edgeworth formalism is shown to be well-suited for 'local' 
cubic-order non-Gaussianity parametrized by <7nl- We show that a strong scale-dependence in the 
bias can be produced by (?nl of order 10 5 , consistent with CMB constraints. On correlation length 
of ~ 100 Mpc, current constraints on #nl still allow the bias for the most massive clusters to be 
enhanced by 20 — 30% of the Gaussian value. We further examine the bias as a function of mass 
scale, and also explore the relationship between the clustering and the abundance of massive clusters 
in the presence of <?nl- We explain why the Edgeworth formalism, though technically challenging, 
is a very powerful technique for constraining high-order non-Gaussianity with large-scale structures. 



I. INTRODUCTION 

One of the most intriguing unanswered questions in cosmology is whether or not the primordial seeds that grew into 
large-scale structures observed today were laid down as a Gaussian random field. In the simplest single-field inflation 
model of the early Universe, the initial distribution of the primordial seeds, or density fluctuations, is expected to 
be very close to Gaussian [J Q , but deviations from Gaussianity may be large in more complex models involving 
multiple fields 0-Q or a non-canonical Lagrangian num. Therefore, a detection of a significant level of primordial 
non-Gaussianity is of great importance as it would effectively rule out a large class of single-field inflation and open 
an observational window to the early Universe. 

The observational signatures of primordial non-Gaussianity manifest across a large range of physical scales. On 
very large scales of order several gigaparsecs, non-Gaussianity can be detected, for instance, in the 3-point correlation 
function (bispcctrum) of the cosmic microwave background (CMB) anisotropics (see [H, El] for recent reviews). In 
the simplest setting in which the bispectrum is parametrized by the constant /nl, the prospect of constraining non- 
Gaussianity with the CMB seems very promising indeed. The Planck satellite 1 will most likely tighten the constraint 
on /nl to C(a few). On smaller scales, the distribution of galaxy clusters can provide competitive constraints on 
non-Gaussianity, which changes the abundances and clustering properties of large-scale structures (see [H], [H| and 
references therein). 

A particularly interesting large-scale-structure probe of non-Gaussianity was presented in the seminal work of Dalai 
et q?.[l6l|. who showed quantitatively that non-Gaussianity induces characteristic changes the clustering of density 
peaks corresponding to rare objects. Specifically, for a correlation length r, we can write 

£ P k(r)=6!(rK(r), (1) 

where £ p k denotes the correlation function of density peaks, £ is that of the underlying dark-matter distribution and 
bh is the bias parameter (these parameters will be explained in detail later). Physically, the bias quantifies how the 
density peaks traces of the underlying matter distribution. If the density fluctuations are Gaussian distributed, it can 
be shown that the bias is almost constant (i.e. scale-independent) to a good approximation 17]. The scale- dependence 
of the bias induced by non-Gaussianity is the focus of this work. 

Scale-dependent bias from non-Gaussianity is a relatively young but rapidly developing topic. Whilst the dependence 
of the bias on /nl was investigated in [l6j , a number of authors have since examined the bias for higher-order non- 
Gaussianity [13], non-local models [l!| and, more recently, scale-dependent /nl [13] amongst others. The focus of 
previous works in this area has been the calculation of the bias in Fourier space whilst relying on either numerical 
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simulations or some well-known mass functions. In this work, we show that it is possible to calculate the bias directly 
in real space by comparing the one- and two-point probability distribution functions (pdfs). 

We propose to reconstruct the pdfs by using the Edgeworth series in one and two variables (see [U H3] for 
reviews). The Edgeworth formalism is a mathematically powerful way to capture the statistical essence of non- 
Gaussian distributions. In previous astrophysical applications, the Edgeworth series were invariably heavily truncated 
[23M27I ] yielding pdfs that may not be well-defined, non-negative distributions. In this work, we give a general algorithm 
which allows the Edgeworth series to be kept to arbitrarily high order. 

We shall see later that given a limited amount of statistical information on the density fluctuations, the Edgeworth 
formalism is particularly well suited for the reconstruction of non-Gaussian distributions in which the cubic-order 
non-Gaussianity parameter, <?nLj is non-zero. This parameter will be the main focus of our calculations. Once well- 
defined pdfs are reconstructed, the information on the non-Gaussian bias can then be easily extracted from the one- 
and two-dimensional pdfs. 



II. THE PRIMORDIAL DENSITY FLUCTUATIONS 



We begin by introducing the necessary parameters which will allow us to describe the density fluctuations statisti- 
cally. 

Let p c , pb, p r , Pa be the time-dependent energy densities of cold dark matter, baryons, radiation and dark energy. 
Let p m = p c + pb- We define the density parameter for species i as 

n, s *^, (2) 

Pcrit 

where p cr ; t is the critical density defined by p CI1 t = 3Hq /8irG. The Hubble constant, Hq, is parametrized by the usual 
formula Hq = lOO/i km s~ Mpc . Results from a range of astrophysical observations are consistent with h ~ 0.7, 
tt c ~ 0.23, tt b - 0.046 and Q r ~ 8.6 x 10~ 5 , with fi A = 1 - Q m - fi r (see e.g. [HHl]). 
The density fluctuation field, 6, is defined at redshift z as 

6{x,z)= -— , 3 

\Pm\Z)) 

where (p m ) is the mean matter energy density. As we are mainly interested in the present-day value of 5, we shall 
drop the z-dependence in our notation and take 5 = S(z = 0). The Fourier decomposition of <5(x) is given by 

5(x) = / J0? 6{kyk ' x - (4) 

The gravitational Newtonian potential $ is related to the density fluctuation by the cosmological Poisson equation 

2 / k N " 



Statistical information on <5(x) can be deduced from that of S(k). However, due to the finite resolution of any 
observation, we can only empirically obtain information on the smoothed density field. Given a length scale i?, the 
smoothed density field, 6r, is given by 

5 R (k, z) = W(kR)T{k)5(k), (6) 
where k = |k|. We choose W to be the spherical top-hat function of radius R. In Fourier space, we have 

sin(fc-R) cos(kR) 



w(kR) = 3 ( k Rr 

It is also useful to define the mass of matter enclosed by the top-hat window as 

3 

3"" rm " \h~ l Mpc 



(7) 



4 /J? 
M = -TrR 3 p m Ps 1.16 x 10 12 ) h~ l M Q . (8) 



3 



We follow the approach outlined in [30| and use the transfer function T of Dicus 



T(x) = 



ln[l + (0.124a;) 2 
(0.124a;) 2 



1 + (1.257a;) 2 + (0.4452a;) 4 + (0.2197a;) e 
1 + (1.606a;) 2 + (0.8568a;) 4 + (0.3927a;) e 



1/2 



(9) 



In addition, we also incorporate the baryonic correction of Eisenstein and Hu [3l| , whereby the transfer function is 
evaluated at 



1/2 r 



Hnfl 



s l m 



1 - a 



-i -l 



1 + (0.43fcs) 4 



(10) 



with 



and 



a = 1 - 0.328 ln(431^ m /i 2 )^p- + 0.38 ln(22.3Q m h 2 ) 



\l<rn 



44.51n(9.83/0„/i 2 ) ^ 
s = — , Mpc. 

The matter power spectrum, P(k), can be defined via the two-point correlation function in Fourier space as 

(«(ki), <5(k 2 )) = (27r) 3 <J D (ki + k a )P(fc), (11) 

where (5^ is the 3-dimensional Dirac delta function. In linear perturbation theory, it is usually assumed that inflation 
laid down an initial spectrum of the form fc" s , where n s is the scalar spectral index (assumed to be 0.96 in this 
work). Physical processes which evolve P(k) through the various cosmological epochs can simply be condensed into 
the equation 



P(k) oc P^{k)T 2 (k), 

where P<p(k) oc fc™ s_4 . It is also common to define the dimensionless power spectrum V(k) as 



r(k) ^ ^pm ex 

27T Z 



k 



n B -l 



Consequently, the variance of density fluctuations smoothed on scale R can be written as 

dk 



where 



A(k, z) 



k 



k 



A\k,z)V(k), 



tt T(x EH )W(kR). 



3f2 TO \Hq 

In our numerical work, we shall normalise P(k) so that 

a s = a{R = 8/i _1 Mpc, z = 0) = 0.. 



(12) 



(13) 



(14) 



(15) 



(16) 



Finally, the correlation function £ is defined in real space as £(xi,X2) = (<5(xi), <5(x2)). If |xi — X2I = r, we can 
write 



dk 



A 2 (k,z)V(k)j (kr), 



(17) 



where jo(x) — sina;/a; (see e.g. |32|). In the limit that r — > 0, we recover the auto-correlation ([14 
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III. THE CLUSTERING OF DENSITY PEAKS 

The idea that the clustering of density peaks could be measured can be traced back to the pioneering work of Kaiser 
[It}. Let P\ be the probability that the overdensity at a randomly selected point is above some threshold S c , so that 

Pi = J p{x)dx, (18) 

where p{x) is the pdf for the overdensity. We shall take p(x) to be a weakly non-Gaussian distribution, which permits 
a valid Edgeworth expansion. This will be discussed in detail in the next section. We take S c — 1.686, corresponding 
to the threshold overdensity for spherical collapse. 

Density peaks tend to cluster, and therefore the occurrences of two density peaks are not independent random events. 
Indeed, the probability that the overdensities at two randomly selected points, separated by comoving distance r, 
both exceed S c is given by 

/•oo /*oo 

P'2 = I / p(x 1 ,x 2 )dx 1 dx 2 , (19) 
where p{x\,X2) is the joint pdf. The density-peak correlation function £ p k can be defined as 

Uto = % - 1- (20) 

Note that £ p k = if any two density peaks occur independently. 

The bias parameter, b^, in Lagrangian coordinates is defined as the ratio 

bl = ^ (2D 

which quantifies the amplitude at which density peaks trace the underlying matter distribution. At late time, what 
is observable is the Eulerian bias, b, 

b=l + b L . (22) 
If the underlying distribution of 5 were Gaussian, it is well known that in the limit 8 C / ctr 3> 1 [13] 

^Gaussian ~ 1 ~\~ 2" (23) 

which is scale-independent to a good approximation. Our goal is to quantify the variation in b induced by non- 
Gaussianity. 

IV. THE EDGEWORTH SERIES 

Equation (|2"0j) shows that it is possible to calculate the bias directly once the probability distribution p(S) and the 
joint distribution p(5\,d2) are known. In this section, we shall explain how these distributions can be reconstructed 
from a few lowest-order moments of the distribution. This technique involves the Edgeworth series, which has been 
explored by previous authors in simpler forms [23M27I l33j . The Edgeworth series can be summarised schematically as 

Non-Gaussian pdf = Gaussian x (1 + deviation), (24) 

where the deviation comprises all known moments of the distribution. In what follows, we define the normalised 
overdensity as 

»=^, (25) 

OR 



so that (v) = 1. 
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Univariate series 



We shall use the form of the univariate Edgeworth series given by Petrov [34J , who developed a method for calculating 
the series to arbitrarily high order. Given a non-Gaussian pdf with zero mean and variance a R , we can express its 
deviation from Gaussianity as a power series in a R : 



V {y) = N(y) 



1 



where N(y) is the normal distribution 



N(u) 



1 



(7rV 2tt 

and the coefficients E s in the power series are given by 



ex P TT 



{k m } L 



H 



s+2r 



'm+2 



^ k m \ \ {m + 2y. 



(26) 



(27) 



(28) 



We now explain the various components of the coefficient (f2"5|) . Firstly, the sum is taken over all distinct sets of 
non-negative integers {k m }^ n=1 satisfying the Diophantine equation 

k x +2k 2 + ■■■ + sk a = s. (29) 

We also define 

r = ki + k 2 + . . . + k s . (30) 
Secondly, the function H n (i>) is the Hermite polynomial of degree n. They can be defined by the Rodrigues' formula 

H n {v) = (-l) n e» 2/2 ^ (e^ 2 / 2 ) • (31) 
For example, Hq(v) = 1 and H\(v) — v. Higher order polynomials can be easily obtained via the recurrence relation 



H n+ i{v) = vH n {v) - nH n -i{v) 
Thirdly, the reduced cumulants, S n , is defined by 



(32) 



(33) 



where (<5^) c is the nth cumulant. For a distribution with zero mean, the relationships between the first few cumulants 
and moments are 



(34) 



(6 R ) C = 0, (5 2 B ) C = 4, 

(4) c = (s%), (4)c = (4)-34- 

Note that if p{v) is Gaussian, the cumulants of order > 3 vanish identically, and so do the expansion coefficients 
as one might expect. 

Throughout this work we shall often make references to the skewness and kurtosis, which are defined respectively 
as (5 jf) j 1 u\ and {5 R )/(J R . The excess kurtosis is defined as as (S R )/cr R — 3, with 3 being the kurtosis of the Gaussian 
distribution. 



B. Bivariate series 

The bivariate Edgeworth series appeared in astrophysical contexts in [35l - f37j . although in these works the series 
was truncated at low order and resembles a bivariate Gram-Charlier series (see [21( for detail of the distinction). In 
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TABLE I: The partitions and bipartitions for the integer 3. 



Partition [eq. d38ll] 


Bipartition [eq. (13911] 


[3] 


(50), (41), (32), (23), (14), (05) 


[21] 


(40)(30),(40)(21),(40)(12),(40)(03), 
(31)(30),(31)(21),(31)(12),(31)(03), 
(22)(30),(22)(21),(22)(12),(22)(03), 
(13)(30),(13)(21),(13)(12),(13)(03), 
(04) (30) , (04) (21) ,(04) (12) , (04) (03) 


[111] 


(30) 3 ,(21) 3 ,(12) 3 ,(03) 3 , 

(30) 2 (21),(30) 2 (12),(30) 2 (03), 

(21) 2 (30),(21) 2 (12),(21) 2 (03), 

(12) 2 (30),(12) 2 (21),(12) 2 (03), 

(03) 2 (30),(03) 2 (21),(03) 2 (12), 

(30)(21)(12), (30)(21)(03), (30)(12)(03), (21)(12)(03) 



[38j and [39j | . the authors presented a bivariate Edgeworth series expanded to an arbitrary number of terms. In this 
form, the series is given by 



p(v, v') = N(y, v 1 ) 



i+EE E p M 

S=l{Pm}{p«,<?i,7T 4 } 



(35) 



where v and v' are normalised overdensities smoothed on the same scale. The bivariate Gaussian distribution N(v, v 1 ) 
is given by 



N(p,u') = 



1 



exp 



v l - Ipvv 1 + v' 2 
2(1 -P 2 ) 



where p is the normalized correlation 



(36) 



(37) 



Given an integer s, the second sum in (|35|) is taken over all distinct sets of positive integers {-P m }L=i satisfying the 
partition conditions 



P!+P 2 + ...+Pi = S, 

Pi > Pi > ■ ■ ■ > Pi > 0. 



(38) 



For a given partition {P m }m=i, the third sum is taken over all distinct sets of non-negative integers (pi, qi) satisfying 
the bipartition condition 



Pi + q t =Pi+2. 
If (pi, qi) appears 7ti times in the bipartition, we write 



(39) 



[P1P2 ...Pt] = [(puqiV 1 (P2, Q2V 2 ■ ■ ■ (PJ, qjY J ] with £ ir t = I. 



(40) 



As an example, the partitions and bipartitions for the integer 3 are given in Table [TJ The number of partitions and 
bipartitions for integers up to 6 are shown in Table ITT] 
For each unique bipartition, the function F is given by 



F{y,v') 



n 



1 /A 



7T;! \Pi l -q%l 



#p,q (V, V') , 



(41) 



p = E ft7Fi ' q = E ft7ri - 



7 



TABLE II: The number of partitions and bipartitions for some integers. 



Integer 


^partitions 


^bipartitions 


1 


1 


4 


2 


2 


15 


3 


3 


46 


4 


5 


131 


5 


7 


342 


6 


11 


851 



Here H Piq denotes the bivariate Hermite polynomial defined analogous to (f3~T|) as 

# („ v') = W „ n a N(v,i/), 



(42) 



iV(i/, i/') = exp I - 



i^ 2 - 2/W + i/ 2 



2(1 -p2) 

In the Appendix, we outline how H p q (v, v') can be efficiently computed. The coefficient A Pi9 is defined as 



A„,,(r) 



a p+q 



(43) 



where (5' = <5(x'). In other words, A p g is the connected part of the correlation between 5 P and S' q . We shall refer to 
(S p 6' q ) c as a joint cumulant (typically there would be a number of joint cumulants of the same order). Similarly, we 
speak of a joint skewness in the case p + q = 3, or a joint kurtosis when p + q — 4. 

Finally, note that F contains information on the cumulants of order 3 and higher. One also easily checks that ([33)) 
reduces to the bivariate Gaussian distribution when F = 0. 



V. CUMULANTS AND LOCAL NON-GAUSSIANITY 



The previous section established the ingredients necessary for the reconstruction of the non-Gaussian pdfs in one 
and two variables via the Edgeworth series. It is useful to connect these ingredients (which consist of cumulants of 
the distributions) to a more familiar measure of non-Gaussianity, for example, the parameters /nl and <7nl- 

The most widely studied type of non-Gaussianity is the 'local' type parametrized, at lowest orders, by /nl and (?nl, 
which are the coefficients in the Taylor expansion of the non-linear Newtonian potential, in terms of the linear, 
Gaussian field, 4>, 

$(x) - 0(x) + /nl (0 2 (x) - (0 2 )) + . 9 nl0 3 (x) + . . . . (44) 

We adopt the 'large-scale-structure' convention in which $ is extrapolated to z = 0. We also take /nl and (?nl to 
be constant, although it is conceivable that they may be scale-dependent. In this section, we shall calculate the joint 
skewness and kurtosis as a function of /nl and #nl (see [13, HH for previous treatments of the joint cumulants) . 



A. Joint skewness 



We loosely take joint skewness to mean a family of correlations comprising the following quantities 

(<5 3 ) c , (Oo (S 2 S') C , (65' 2 ) c . (45) 

The first two quantities are equal to the one-point cumulant a 4 S3. It is worth emphasising the subtle difference 
between S3 and A3 t o 

A 3 ,o = <JS 3 . (46) 
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The remaining two correlations in (I45[) equal 



r dk t dk' 

<7 3 A li2 (r)=2/ NL J j^J —A(k)A(k')A(\k + k'\)P^k)P^k>) 



1 



, iW|k + k'|) 



ol (k+k')-r 



(47) 



where r = |r| = 1x2 — xi| [40(. This expression cannot be analytically evaluated without significant approximations 
as was done in 24 , |35| . In this work, we numerically evaluate the joint cumulants directly by a simple change of 
coordinates. In ((47]), one can align r along the z-axis and introduce spherical coordinates to arrive at 



2tt 



d(f>i A(k u ) 



where fci2 
and 0;,, 



(kl 



2k 1 k 2 Q 1 



.1/2 



03 ) +W 



1/2 



Note that we can obtain A3 t o by simply evaluating Ai.2(0). 



B. Joint kurtosis 



= ir(fci^i+fe 2 A I 2) 



(48) 

(49) 
(50) 



Joint kurtosis refers to three quantities, namely, (S 4 ) c , (S 3 S') C and (5 2 8' 2 ) c . Again, it is worth pointing out that 

and that A^o may be obtained from the other 2-point correlations via the relations 

A 4 ,o = A 3 ,i(0) = A a , a (0). 
A change of coordinates again yields the integral expressions for these correlations, 

dk. 



(51) 
(52) 



<r 4 A 3 ,i(r) = 



n 

i=l 
3 

3 i 3NL 



a 4 A 2 ,(r) 



32tt 

(n 

i 

32^ 



_i JO 



dki 



A(ki)V(ki) 



1 ,2tT \ 

d^J A(fe 4 ) e f(*iMi+fc2M2 + fe3M3) x 



(53) 



1 /.27T 
d^i / d 

1 Jo 



A(k 4 ) e ir ( fcl ^ 1+fc2 ^) x 



3ffNL 



P^ki) P^ki) 



p^h) P4k 3 ) 



where fc 4 = (fc 2 + k\ + fc 2 + 2k x k 2 Q 12 + 2k 2 k 3 Q 23 + 2k 1 k 3 Q 13 ) 



1/2 



(54) 
(55) 



Here I and J are the contributions of /nl to the 4-point correlations. The forms of these contributions depend on 
the symmetries in the integrals above. One can show that 



J (3) ' 



(2X4) 

(2) 1 (i)(3)y 

(3)(4)\ , (12) + (23) 



(1)(2) 



(2) 



1 



(2)(4) 

(i)(3)y 



(56) 



(57) 



where we have used the shorthand (1) = P^ikx) and (23) = P4,(k 23 ) etc. 2 . Since I and J blow up whenever fci2, £23 
or fci 3 vanishes, it is necessary to introduce a large-scale cut-off to evaluate these integrals. To avoid sources of errors 
associated with this cut-off, we shall only consider the case in which /nl = 0. 



2 Setting /nl = in equations Il53|l - ll54ll . we recover (A5)-(A6) of [lSl |. The latter then proceeded with large-scale approximations in 
Fourier space whereas we have not. 
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g NL = -1 7 Gaussian g NL =10 7 




0.05 0.1 0.15 0.2 0.25 



FIG. 1: The joint pdfs p(5i,S2) corresponding to (left to right) g^h = — 10 7 , and 10 7 (/nl — 0), reconstructed using the 
bivariate Edgeworth expansion of order 4, with R = 8/i -1 Mpc. The horizontal bar gives the colour code for the probability 
value on the square grid [—3, 3] 2 . The distributions peak more sharply with increasing <?nl- Large values of <;nl have been 
used for illustrative purposes. 



Figure [T] shows the joint pdfs with c/nl = — 10 7 , and 10 7 (/nl = 0) reconstructed using the bivariate Edgeworth 
expansion of order 4. We have chosen large values of c/nl to visually illustrate the effect of <?nl on the joint pdf 
(namely, the increase in the sharpness of the peak as <?nl increases) . 



VI. POSITIVITY OF THE RECONSTRUCTED PDFS 



Since the reconstructed pdf will be used to calculate the abundance and the bias of large-scale structures, it is 
important that the pdf obtained via the Edgeworth series is positive definite. 

In general, the positivity of the Edgeworth series is difficult to maintain. As far as we are aware, there exists no 
general prescriptions that guarantee the positivity of the bivariate Edgeworth series (see [33( for the analysis of the 
univariate series). Our investigation shows that the joint pdf tends to develop negative regions whenever the univariate 
pdf does. For fourth-order series used in this paper, the combinations of 5*3 and 5*4 that yield a non-negative pdf are 
shown in Figured! For /nl = 0, this corresponds to <7nl in the range < #nl < 10 s . For #nl outside this range, 
the reconstructed pdf can develop regions in which p < 0. This range of validity is well within the observational 
constraints on <?nl (at 2a): 



- 5.6 x 10 5 < ffN L < 6.4 x 10 5 , (Vielva and Sanz |4l|), 

-7.4 x 10 5 < ffN L < 8.2 x 10 5 , (Smidt et al. 0), 

-3.5 x 10 5 < <?nl < 8.2 x 10 5 , (Desjacques and Seljak [TJ]). 

In Section I Villi we shall discuss whether it is possible to extend the range of validity of the Edgeworth series to 
include extreme values of $nl ■ 



VII. SCALE-DEPENDENT BIAS INDUCED BY 3NL 



Using the results in the previous sections, we are now ready to calculate the bias shift induced by #nl- We summarise 
the main steps and technical details below. 
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FIG. 2: Validity of the 4th-order Edgeworth expansion (|26[l . The shaded region corresponds to the combinations of S3 and S4 
for which there exists a non-negative pdf. On cluster scales where a ~ 1, this corresponds to |/nlI 10 3 an d < (?nl < 10 s - 
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FIG. 3: Fractional change in the joint probability P2 (Eq. I19[) as the order of the bivariate Edgeworth expansion increases 
from 4 to 6, plotted against correlation scales (with <?nl = 10 6 ). The change is less than 0.01 percent, showing that the 
expansion (|35|l is not highly sensitive to the truncation. 



1. For a given value of <7nl, we calculate the one and two-point cumulants using (|48|) . ([53]) and (|54|) for a range of 
values of correlation length r. We only consider the case /nl = to avoid additional errors from the infrared 
cut-off in the integrals (|53 |) -(f54 |) . We initially perform this step at a fixed smoothing scale R = 8/i _1 Mpc (the 
dependence on R will be investigated shortly). 

2. The cumulants are then used to reconstruct the univariate and bivariate pdfs using the Edgeworth expansions 
([251 and (ESD of order 4. 

3. The reconstructed pdfs are checked to ensure that they are non-negative. For the univariate pdf, this is satisfied 
when (?nl is in the range [0, 10 8 ]. For these values the bivariate pdfs were also found to be non-negative. 

4. Finally, the pdfs are integrated and combined to give the bias b as described in Section ITO1 

It is worth investigating whether the bias is sensitive to the order at which the bivariate series is truncated. First, 
note that increasing the expansion to fifth-order expansion results in no change in the bivariate pdf (since we have 
assumed that the odd joint cumulants vanish). Figure [3] shows the fractional change in the joint probability Pi (Eq. 
[T9]) expressed as the ratio |P2(6th order)/^ (4th order) — 1| with ^nl = 10 6 . We see that the change is less than 0.01 
percent over the range of scales of interest. Thus, we conclude that the bivariate expansion is not strongly sensitive to 
the truncation order. This is generally observed for other values of <?nl- Considering this modest increase in accuracy 
at the price of a tremendous increase in the run-time of the code, we find that the 4th order bivariate expansion is 
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FIG. 4: The effects of <?nl on the bias b as a function of correlation length r. The upper panel shows b(r) for various values 
°f 5NL- The lower panel shows the ratio between the non- Gaussian and Gaussian biases. These curves were calculated at 
smoothing scale R = 8 h~ 1 Mpc, using fourth-order Edgeworth expansions. See the text for more discussion. 

adequate for our current investigation 3 . 

A. Results 

Figure 2] shows the effects of non-Gaussianity on the bias with ^nl up to 10 6 , using the smoothing scale R = 8 
/i~ 1 Mpc (corresponding to objects of mass ~ 10 13 /i _1 M Q ). The bias is plotted as a function of correlation length of 
up to ~ 100 /i _1 Mpc (a typical inter-cluster distance). This is the main result of our work. Note that in the Gaussian 
case, b is constant on sufficiently large scales to a good approximation. 

In general, we observe that large <?nl enhances the clustering of objects on the largest correlation scales. A significant 
enhancement in the bias can be observed on scales of around 80 Mpc and beyond, consistent with the results of the 
numerical simulations in [l8[. For 9nl = 5 x 10 5 (saturating the CMB constraint) the bias is enhanced by as much as 
20 - 30% at correlation length of ~ 100 fr^Mpc. 
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3 For detail of the sensitivity of the univariate series to the order of truncation, see [33H 
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FIG. 5: The effects of g^h on the bias as a function of the mass scale M for a fixed correlation length r = 50/i _1 Mpc (left) 
and 100/i -1 Mpc (right). In each figure, the upper panel shows the bias b(M) for a range of values of <?nLj whilst the lower 
panel shows the ratio of the non-Gaussian and Gaussian biases. See the text for more discussion. 



The bias for </nl = — 10 5 and —5 x 10 5 in |4] are included for comparison but should be regarded with caution. 
As described earlier, the reconstructed pdfs are not positive definite in these cases due to the lack of information 
on higher-order moments. Nevertheless, we see the general trend that a negative c/nl can significantly suppress the 
clustering of density peaks. 

B. Dependence on the mass scale 

We now consider the non-Gaussian bias when the smoothing scale R, or, equivalently, mass scale M, varies while 
keeping the correlation length fixed. This is useful in determining the effects of non-Gaussianity on the clustering 
of structures of varying masses for a given correlation length. Figure [5] summarises these effects for r — 50 and 100 
/i _1 Mpc. In each panel, the bias is plotted as a function of mass scale (M < 10 16 Mq). In addition, we impose the 
constraint r > 3i? to avoid nonlinear effects that emerge when the smoothing and correlation scales are comparable. 

We observe a monotonic increase in b as R increases, although this dependence is generally weak for a wide range of 
correlation scales. The monotonic increase in &/&Gaussian is observed for smoothing scales R well above the correlation 
length. The change in curvature seen in the lower panel on the right for 9nl = 10 6 is most likely a symptom of 
nonlinear effects as R ~ r, and a gradual breakdown of the 4th-order expansion. 

At large correlation lengths and in the presence of large <?nl, we observe a noticeable enhancement in the bias. For 
example, at r ~ 100 Mpc, the bias for the most massive clusters (M ~ a few x 10 15 M Q ) is enhanced by 20 — 30% 
with <7nl = 5 x 10 5 . For a shorter correlation length of order a few x 10 Mpc, <7nl introduces only a sub-percent 
enhancement in the bias. 



C. Clustering versus abundance 

The two main manifestations of non-Gaussianity in the distribution of large-scale structures are in the abundance 
and the clustering of rare objects. These effects for (?nl are displayed in Figure [51 which shows the bias as a function 
of the differential abundance 
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FIG. 6: The effect of g^L on the clustering-abundance relationship for rare objects with correlation length r = lOO/i 1 Mpc. 



Left: The bias as a function of differential abundance dn/dM. 
rarest clusters) to 10 13 AfQ. 



Masses of objects in this range vary from ~ 5 x 10 M© (the 



Sin**"*- (58) 

where n(M) is the number density of objects of mass M and p{x, M) is the pdf smoothed by a window function 
containing mass M. On the horizontal axis, the range of masses varies from 1O 16 M (the rarest clusters) down to 
10 13 M Q (a typical galaxy group). Again, we see the general trend that both the bias and the abundance of massive 
clusters increase with <7nl (see e.g. [331 ] for detailed calculation of the abundance). The non-Gaussian effects are more 
pronounced for rarer, more massive clusters. 



VIII. POSITIVITY OF THE PDF BY SQUARE- WEIGHTING 



Given moments up to order 4 of the distribution of large-scale structures, we have shown that it is possible to 
construct positive-definite pdfs (in both one and two variables) for (jnl in the range [0, 10 s ] . For the technique to be 
applicable for <?nl outside this range, higher-order moments must be known. A similar conclusion can be drawn for 
the case of purely /NL-type non-Gaussianity (with §nl = 0). 

The positivity of the Edgeworth series is a long-standing problem which is not easily overcome. An interesting 
solution sometimes employed in the economics literature is the square- weighting and renormalisation of the Edgeworth 
series 14314451. For instance, one could take 



p{v) 



for the univariate series, and similarly, 



Ci 



lEsiy) 



(59) 



p{v, v') = 



N(u, v') 
C 2 



s=l{F m }{Pi,9»,7r 4 } 



(60) 



for the bivariate series. Here C\, C2 are constants that renormalise the pdf in each case (note that for the Gaussian 
case, C\ = C2 = 1). 

We have experimented with the square- weighting and found the method to be unsatisfactory. For instance, we found 
numerical artefacts such as oscillations in the bias due solely to the square- weighting, and are therefore unphysical. 
This is not surprising because the square- weighting changes the statistical information of the distribution significantly, 
and thus the results are difficult to interpret. In addition, there is an order-of-magnitude increase in computing time 
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FIG. 7: The joint pdfs p(5i,<52) corresponding to (left to right) /nl — 10 3 , and 10 3 (gNL — 0), reconstructed using the 
square-weighted bivariate Edgeworth expansion of order 4, with R = 8/i -1 Mpc and r = 100ft _1 Mpc. The horizontal bar gives 
the colour code for the probability value on the square grid [—3, 3] 2 . The peak is skewed to the right (towards the first 
quadrant of the x — y plane) for /nl < 0, and left (towards fourth quadrant) for /nl > 0. 

due to the renormalisation at every time step. Therefore, until further analyses of this sort of square- weighting are 
performed, we cannot recommend this technique at this point. Nevertheless, for illustrative purposes, we display the 
reconstructed square- weighted pdf in Figure [71 in which large values of positive and negative /nl skew the pdfs (which 
are positive definite) in opposite directions as expected. 

IX. CONCLUSIONS 

In this work, we have demonstrated an alternative method of calculating the bias in the clustering of rare objects 
in the presence of primordial non-Gaussianity. Our method is based on the reconstruction of the pdf of density 
fluctuations using the Edgeworth series in one and two variables. The bias obtained in this way is in real space, in 
contrast with previous works that examined the scale-dependence bias in Fourier space. 

A step-by-step guide to our method is presented in Section IVIIl Some of the expressions involved (e.g. (|35p ) may 
seem complicated, but this is because they incorporate information on arbitrarily high-order correlations. As long 
as estimates on these high-order correlations are available, our formalism can, in principle, be used to study the 
observable signatures of high-order non-Gaussianity. In addition, the reconstruction algorithm is independent of the 
form of non-Gaussianity, hence making our method easily applicable to non-local forms of non-Gaussianity as well. 

The Edgeworth formalism is a powerful technique that captures all the statistical information of a probability 
distribution. However, previous astrophysical applications generally dealt with the lowest-order expansions, and 
therefore the reconstructed pdfs were often found not to be positive definite (in fact, at the lowest order the univariate 
pdf can never be positive definite). Results obtained from working with pdfs that are not positive definite are 
unreliable, especially in the context of large-scale structures which are particularly sensitive to the tail end of the pdf. 

In this work, we concentrate on the case of non-Gaussianity parametrized by positive <7nl, which yields pdfs (both 
uni- and bivariate) that are positive definite. It may be surprising to some that the Edgeworth formalism is more 
easily applied to the case with <?nl ^ rather than the case with purely /NL-type non-Gaussianity. The reason is 
that at leading order, /nl corresponds to the skewness of the distribution. As shown in our previous work [331 ] . this 
information alone cannot define a non-negative pdf. Our previous work also showed that the Edgeworth formalism for 
the case of pure /nl requires the knowledge of moments of order at least 5, for which there exist some observational 
constraints [4^, [43]. The results for /nl are expected to be similar to that of <?nl- This degeneracy can, in theory, 
be broken by comparing the statistics of voids with that of massive clusters, as any asymmetry in the pdf must be 
due to the presence of odd-order cumulants. In practice, however, there is the obvious difficulty of determining the 
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abundance and clustering properties of voids. See [H, |H, |49[ for recent progress. 

Our main results show that <7NL _ type non-Gaussianity can significantly affect the clustering of massive clusters on 
large correlation scales (~ 100 Mpc, typical of inter-cluster distances). A strong scale dependence of the bias can 
be seen in Figure ([3]), which summarises our main results for ^nl up to 10 6 . It appears that current constraints 
on <7nl still allow the bias for the most massive clusters to be enhanced by 20 — 30% of the Gaussian value. Our 
findings are relevant to observations and iV-body simulations in which the clustering of extremely massive objects 
are seen (50j . An interesting extension of this work is, therefore, a pdf reconstruction using moments observed in 
large surveys and simulations. It would then be important to include finite-volume effects (Ell . [52| which have been 
shown to systematically alter the cumulants and hence introduce spurious non-Gaussian effects. By using high-order 
moments and including finite- volume corrections, we expect to be able to extend the Edgeworth formalism to probe 
a much wider range of high-order non-Gaussianity. This is the subject of our future work. 
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Appendix A: Bivariate Hermite polynomials 

The bivariate Hermite polynomial H n . m is defined via the differential equation (|42[) . In this Appendix, we outline 
a technique which allows H (n, m) to be evaluated efficiently. 

Firstly, we assume m > n, otherwise one may appeal to the identity 

H n , m {x,y) = H„^ n (y,x), (Al) 

which can be deduced from (|42l) . The numerical value of H mn {x,y) can be computed using the recurrence relation 
first obtained by Hermite himself [53j 



1 

1 ~p 2 



H n>m+ i(x, y) = 2 [(y ~ px)H n ,m{x, y) + pnH n -^ m (x,y) - mH n ,m-i{x,y)] , n,m > 1. (A2) 



This recurrence requires the knowledge of -ffi.i and -ffo,m for m > 0. It is straightforward to evaluate -f/i.i directly 
from (|42|) . giving 



fT , (V ~ px)(x - py) p 

ff i,i(*. v) = — (i- P 2)2 — + ~r~p2' (A3) 



For i?o,m, a simple change of variable gives 

H , m (x, V) = (l- pT m/2 H m (^^= ) , (A4) 
where H m is the standard Hermite polynomial. 



V^~p 2 
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